A Comparative Study of Demographic Attribute Inference in Twitter

Authors

  • Xin Chen
  • Yu Wang
  • Eugene Agichtein
  • Fusheng Wang
Abstract

Social media platforms have become a major gateway for receiving and analyzing public opinion. Understanding users provides valuable context for their social media posts and can significantly improve traditional opinion analysis models. Demographic attributes such as ethnicity, gender, and age have been extensively applied to characterize social media users. While studies have shown that user groups formed by demographic attributes can hold coherent opinions on political issues, these attributes are often not explicitly stated by users in their profiles. Previous work has demonstrated the effectiveness of different user signals, such as users’ posts and names, in determining demographic attributes. Yet these efforts mostly evaluate linguistic signals from users’ posts and train models on artificially balanced datasets. In this paper, we propose a comprehensive list of user signals: self-descriptions and posts aggregated from users’ friends and followers, users’ profile images, and users’ names. We provide a side-by-side comparative study of these signals on the tasks of inferring three major demographic attributes, namely ethnicity, gender, and age. We use realistic unbalanced datasets whose demographic makeup is similar to that of Twitter for training models and for evaluation experiments. Our experiments indicate that self-descriptions provide the strongest signal for ethnicity and age inference and clearly improve overall performance when combined with tweets. For gender inference, profile images achieve the highest precision score, with an overall score close to the best result in our setting. This suggests that signals in self-descriptions and profile images have the potential to facilitate demographic attribute inference in Twitter and are promising for future investigation.
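The abstract's emphasis on unbalanced datasets and on precision rather than a single overall number can be illustrated with a small sketch. The code below is not taken from the paper: the label names, the 90/10 class split, and the helper functions are all hypothetical, chosen only to show why plain accuracy can look good on unbalanced data while a per-class view does not.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong
            fn[truth] += 1  # true label was missed
    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(scores):
    """Unweighted mean of per-class F1, so small classes count equally."""
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical unbalanced sample: 90 users in group "a", 10 in group "b".
y_true = ["a"] * 90 + ["b"] * 10
# A trivial classifier that always predicts the majority group.
y_pred = ["a"] * 100

scores = per_class_scores(y_true, y_pred)
acc = accuracy(y_true, y_pred)
macro = macro_f1(scores)
print(f"accuracy={acc:.3f} macro_f1={macro:.3f}")
```

In this toy setting the always-majority classifier is right 90% of the time yet never identifies a single member of the smaller group, which the macro-averaged F1 makes visible. Whether the paper reports macro-averaged or some other overall score is not stated in the abstract.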

Similar resources

A Comparative Study of Multi-Attribute Continuous Double Auction Mechanisms

Auctions have long been used as a competitive method of buying and selling valuable or rare items. Single-sided auctions in which participants negotiate on a single attribute (e.g. price) are very popular. Double auctions and negotiation on multiple attributes create more advantages compared to single-sided and single-attribute auctions. Nonetheless, this adds the complexity of the auctio...

A Weighted Combination of Text and Image Classifiers for User Gender Inference

Demographic attribute inference of social networking service (SNS) users is a valuable application for marketing and for targeting advertisements. Several studies have examined Twitter-user gender inference in natural language processing, image recognition, and other research domains. Reportedly, a combined approach using text data and image data outperforms an individual data approach. This pa...

Forecasting Industrial Production in Iran: A Comparative Study of Artificial Neural Networks and Adaptive Nero-Fuzzy Inference System

Forecasting industrial production is essential for efficient planning by managers. Although there are many statistical and mathematical methods for prediction, the use of intelligent algorithms with desirable features has made significant progress in recent years. The current study compared the accuracy of the Artificial Neural Networks (ANN) and Adaptive Nero-Fuzzy Inference System (ANFIS) app...

Demographic Inference on Twitter using Recursive Neural Networks

In social media, demographic inference is a critical task in order to gain a better understanding of a cohort and to facilitate interacting with one’s audience. Most previous work has made independence assumptions over topological, textual and label information on social networks. In this work, we employ recursive neural networks to break down these independence assumptions to obtain inference ...

Gender Inference of Twitter Users in Non-English Contexts

While much work has considered the problem of latent attribute inference for users of social media such as Twitter, little has been done on non-English-based content and users. Here, we conduct the first assessment of latent attribute inference in languages beyond English, focusing on gender inference. We find that the gender inference problem in quite diverse languages can be addressed using e...

Publication date: 2015